Abstract. - Precise analyses of the statistical and scaling properties of the galaxy distribution are 
essential to elucidate the large scale structure of the universe. Given the ongoing debate on its 
statistical features, statistical tools that discriminate accurately between different spatial 
patterns are highly desirable. 

This is especially the case when non-fractal distributions have power-law two-point correlation 
functions, which are usually signatures of fractal properties. Here we review some possible 
methods used in the literature and introduce a new variable called "scaling gradient". This 
tool and the conditional variance are shown to be effective in providing an unambiguous way to 
make such a distinction. Their application is expected to be of utmost importance in the analysis 
of upcoming galaxy catalogues. 



Understanding the statistical properties of the spatial distribution of matter in the uni- 
verse is a fundamental issue in cosmology and astrophysics. It provides an important tool 
to test the features of cosmological models and it is intimately related to the nature of the 
matter distribution and the dynamical processes which have shaped the present universe. 
During the past twenty years observations have revealed a hierarchy of structures (termed 
large scale structure, LSS): galaxies are grouped in clusters, which in turn appear to form 
larger associations, the superclusters, separated by wide, nearly empty regions. 

These structures have been characterized mainly through their correlation properties, in 
particular by the two-point correlation function. Such studies have found the presence of 
power law two-point correlations in a wide range of scales. Many authors have interpreted 
such behavior as the signature of a fractal (or even multifractal) distribution [1-4]. However, 
in many cases, the conceptual and practical implications of a fractal distribution have not been 
fully considered [4,5]. 
Table I - Estimates of the galaxy distribution fractal dimension $D$ and of the range (in Mpc $h^{-1}$) 
over which it extends (if indicated in the corresponding paper). Note that the reported values of $D$ 
are obtained by different measurement methods; for this reason we choose the generic denomination of 
fractal dimension $D$. * in fact, a multifractal with dimension $D_2 \approx 1.3$ and $D_0 = 2$; † due to 
planes, rather than fractal; ‡ homogeneity not evident in the samples analysed; § specific galaxy samples. 

Author                             D                 Range
Mandelbrot (1975) [3]              1.3               ?
Carpenter (1986) [8]               2 - 2.8           ?
Deng et al. (1988) [9]             2.0               ?
Coleman et al. (1988) [4]          1.4 - 1.5         r < 28
Peebles (1989) [6]                 1.23              r < 15
Martinez et al. (1990) [10]        1.3 *             1 < r < 5
Luo & Schramm (1992) [11]          1.2               10 < r < 100
Provenzale (1994) [12]             1.2               r < 4
                                   2 †               4 < r < 25
Guzzo (1997) [13]                  1.2               r < 3.5
                                   2 - 2.3           3.5 < r < 20 - 30
Sylos Labini et al. (1998) [14]    2                 ‡
Scaramella et al. (1998) [15]      < 3               r < 300
Wu et al. (1999) [16]              1.2 - 2.2         r < 10
                                   tends to 3        10 < r < 100
Martinez (1999) [17]               2                 r < 15
                                   3                 r ≳ 30
Pan & Coles (2002) [18]            2.16 (PSCZ) §     r < 10
                                   1.8 (CfA2) §      r < 40



In fact, one of the implications of fractal correlations is that a possible crossover length 
cannot be defined from the usual correlation function. This point has generated a large debate 
in the field [5-7, 13, 14, 16, 17]. In tab. I we present a comprehensive summary of the properties 
of galaxy correlations, as obtained with various methods. The main results are the value of the 
fractal dimension $D$ and the possible crossover length to a homogeneous distribution ($D = 3$). 
Estimates of this scale vary from 10 to 300 Mpc $h^{-1}$ ($h$ is a constant $\approx 0.7$). 

In Sylos Labini et al. [14] it has been shown that galaxy correlations from different samples 
measured with more general statistical tools are consistent with each other and with a fractal 
dimension $D \approx 2$, without a clear detection of any crossover to a homogeneous distribution. 
The detection of fractal properties in LSS raised the issue of their origin. Many authors 
have claimed that fractal structures are naturally formed in cosmological N-body simulations 
(e.g. [19]) driven essentially just by gravitational interactions. 

An alternative, very popular model which also tries to explain the power-law correlations is 
the halo model [20]. This model also takes its inspiration from the analysis of N-body simulations, 
where small scale structures look like compact, almost spherical clusters (halos) with little 
inner substructure (but see e.g. [21]), rather than like fractals. In this model, two-point power-law 
correlations up to the halo size are due to particles belonging to the same halo. The crucial 
point is that some kinds of non-fractal cluster density profiles can give power-law two-point 
correlations, as in a fractal distribution. 

In this model, however, one does not expect to see a single power law from scales smaller 
to scales larger than the halo size (a few Mpc) (tab. I, [14,22]). The detection of a different 
behaviour of correlations in the two regimes has actually been claimed in [23]. 

There is an essential difference between this view, where correlations are due to structures 
with a regular density profile, and the fractal one. Although this difference has been noted by 
some authors [24], there has been little attempt to discriminate in a quantitative way which 
picture actually corresponds to the observed distribution, both for the galaxy data and for N-body 
simulations. In this letter we clarify this basic problem from a conceptual and practical 
point of view. 

In particular, we show that specific statistical tools related to the three-point correlation 
analysis can be usefully applied to discriminate between the various scenarios. Moreover, 
we define a new concept (the "scaling gradient") which appears particularly suitable in this 
respect. The application of these methods to new, large catalogs should help resolve the 
issue of the true statistical properties of the galaxy distribution. 

We start by considering the simple example of a halo characterized by a single power-law 
density profile, first explored in 3d in [20,25]. Since then, there has been a large number of 
studies on the halo properties (for a review, see [26] and references therein). Actually, N-body 
simulations show halos with density profiles which can be approximated by a power-law only 
in a range of scales [27]. However, the profile we investigate here retains the relevant statistical 
features of realistic halos. 

Assume a continuous density distribution in $d$ dimensions decaying from its center as: 

$$\rho(r) = A\, r^{-\beta} \qquad (1)$$

with $0 < \beta < d$. 
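As an illustration of eq. (1), a discrete realisation of such a halo can be drawn by inverse-transform sampling: for $\rho(r) \propto r^{-\beta}$ the mass inside radius $r$ grows as $r^{d-\beta}$, so a uniform random number $u$ maps to $r = u^{1/(d-\beta)}$. The sketch below is only a minimal example under that assumption; the function name `sample_halo` and its parameters are hypothetical and are not taken from the original analysis.

```python
import numpy as np

def sample_halo(n_points, d=2, beta=1.5, seed=0):
    """Draw points with density rho(r) ~ r^(-beta) inside the unit d-ball (0 < beta < d)."""
    rng = np.random.default_rng(seed)
    # Mass within radius r scales as r^(d - beta); invert that cumulative law.
    u = 1.0 - rng.random(n_points)          # uniform in (0, 1], avoids r = 0 exactly
    radii = u ** (1.0 / (d - beta))
    # Isotropic directions: normalised Gaussian vectors.
    directions = rng.normal(size=(n_points, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    return radii[:, None] * directions

points = sample_halo(10000, d=2, beta=1.5)  # same size and beta as the halo of fig. 1
```

The amplitude $A$ is fixed implicitly by the number of points and the unit system size.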

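For simplicity, in the following the formulas refer to systems of unit size. Clearly, such a 
system is not a fractal: there is only one density singularity, at the origin, and the distribution 
is analytic everywhere else. Its density-density correlation (or conditional average density [7]) 
can be estimated analytically and takes the form: 

$$\Gamma^*(r) = \frac{1}{V(r)} \int_{V(r)} \frac{\langle \rho(\mathbf{r}')\, \rho(\mathbf{r}' + \mathbf{r}) \rangle}{\bar{\rho}}\, d^d r \;\propto\; \left\{ C_1 + C_2\, r^{d-2\beta} \right\} \qquad (2)$$

where $\rho(\mathbf{r}')$ is the density in $\mathbf{r}'$, $\bar{\rho}$ is the average density, the average 
$\langle \ldots \rangle$ is performed over the angles between $\mathbf{r}$ and $\mathbf{r}'$ and over 
$\mathbf{r}'$, $V(r)$ is the volume of a sphere of radius $r$, and $C_1$, $C_2$ stand for coefficients 
that depend on $d$ and $\beta$ but not on $r$. 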

Eq. (2) shows that for $\beta < d/2$ the first term in curly brackets dominates; therefore the 
average conditional density is constant, as in a homogeneous density field. For $\beta > d/2$, 
instead, the second term dominates and the average conditional density is a power law with 
exponent $d - 2\beta$ at any scale. This behavior therefore appears identical to that of a fractal 
sample with dimension $D = 2d - 2\beta$. In fig. 1 we show that a halo and a fractal can have 
precisely the same scaling in $\Gamma^*(r)$, even though they are completely different systems. 
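For a finite point set, the average conditional density can be estimated by counting neighbours in spheres centred on each point. The following is a minimal sketch, assuming the centre itself is excluded from the count and ignoring boundary corrections (spheres that stick out of the sample are not treated specially); the helper names `ball_volume` and `conditional_density` are hypothetical.

```python
import math
import numpy as np

def ball_volume(r, d):
    """Volume of a d-dimensional ball of radius r."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * r ** d

def conditional_density(points, radii):
    """Average density in spheres of each radius centred on the points themselves."""
    pts = np.asarray(points, dtype=float)
    d = pts.shape[1]
    # All pairwise distances; O(N^2) memory, fine for a few thousand points.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)           # do not count the centre itself
    gamma_star = []
    for r in radii:
        counts = (dist <= r).sum(axis=1)     # neighbours within r of each point
        gamma_star.append(counts.mean() / ball_volume(r, d))
    return np.array(gamma_star)
```

Plotting the result against the radii on logarithmic axes gives the kind of curve shown in the bottom panel of fig. 1; a fitted slope can then be compared with $d - 2\beta$ for a halo or $D - d$ for a fractal.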

Despite this result, there has been little effort to clarify the difference 
between the two possibilities in the analysis of LSS data and in N-body computer simulations. 

Fig. 1 - Top: left, 2d halo with 10000 points and density given by eq. (1) with $\beta = 1.5$; right, a 
fractal distribution with fractal dimension $D = 1$ generated by a Levy-flight algorithm (8000 points). 
Bottom: average conditional density for the set in the top left frame (empty circles) and the Levy-flight 
fractal (filled circles). It is apparent that the two distributions have the same scaling in $\Gamma^*(r)$. 
The density profile of the halo with $\beta = 1.5$ (dashed line) is steeper than the corresponding 
$\Gamma^*(r)$. The $\Gamma^*(r)$ for a halo with $\beta = 0.5$ is shown by the solid line. 

In principle, a distinction between different sets of points with the same two-point correlation 
properties could be obtained using box counting methods [28]. For the system described 
by eq. (1) we have: 

$$\chi(q) = \lim_{l \to 0} \sum_i \mu_i^q = \lim_{l \to 0} \left( B_1(\beta, q)\, l^{d(q-1)} + B_2(\beta, q)\, l^{q(d-\beta)} \right), \qquad (3)$$

where $l$ is the box size, $\mu_i$ is the mass inside the box $i$ and the sum extends over all the 
boxes; $B_1(\beta, q)$ and $B_2(\beta, q)$ are constants, depending on $\beta$ and $q$ but not on $l$, and 
$\chi(q)$ is the partition function. From eq. (3) it is easy to find the multifractal spectrum for the 
system: for $q < d/\beta$, $\alpha = d$ and $f(\alpha) = d$; for $q > d/\beta$, $\alpha = d - \beta$ and 
$f(\alpha) = 0$. The exponent $\alpha$ describes the scaling of the mass inside a box of size $l$ as 
$l \to 0$, and $N \propto l^{-f(\alpha)}$ is the number of such boxes. These results reveal a homogeneous 
($f(\alpha) = d$) distribution of boxes whose average density $\rho(l) \propto l^{\alpha}/l^{d} = l^{0}$ is 
constant, and a finite ($f(\alpha) = 0$) set of boxes (in this case only one), whose average density 
scales as $\rho(l) \propto l^{\alpha}/l^{d} = l^{-\beta}$. 
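The two branches of the spectrum follow from keeping, for each $q$, the term of the partition function that dominates as $l \to 0$, i.e. the one with the smaller exponent. A minimal sketch of that bookkeeping, assuming the two exponents $d(q-1)$ and $q(d-\beta)$ of eq. (3) and the usual Legendre relation $f = q\alpha - \tau$ (the function names are hypothetical):

```python
def tau(q, d, beta):
    """Exponent of the dominant term of the partition function as l -> 0."""
    return min(d * (q - 1), q * (d - beta))

def spectrum(q, d, beta):
    """Return (alpha, f(alpha)) for the halo of eq. (1) at moment order q."""
    # The two terms cross at q = d / beta; alpha is the slope of tau(q).
    alpha = d if q < d / beta else d - beta
    f = q * alpha - tau(q, d, beta)
    return alpha, f

# Example: d = 3, beta = 2, so the crossover is at q = 1.5.
low_q = spectrum(1.0, 3, 2)    # homogeneous branch
high_q = spectrum(2.0, 3, 2)   # central-singularity branch
```

Only two points of the $f(\alpha)$ curve exist for this system, which is what distinguishes it from a genuine multifractal with a continuous spectrum.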

The multifractal analysis, therefore, correctly detects the presence of the central singularity 
and of an analytic distribution everywhere else. However, if we consider a system described 
by eq. (1), but made of a discrete set of points, the identification of scalings by box counting 
analysis is no longer straightforward. Since the system is not uniform, the local interparticle 
distance $\lambda$ is a function of the distance $r$ from the center: $\lambda(r) = (A^{-1} r^{\beta})^{1/d}$. 
It is easy to see that, if one considers boxes of size $l > l_0 = A^{-1/d}$ (where $A$ is the amplitude 
in eq. (1)), they are occupied on average. If, on the other hand, $l < l_0$, one can define a 
characteristic distance $r_0$ from the center by the equation $\lambda(r_0) = l$. The boxes at distances 
$r > r_0$ contain on average one or no particles, while the boxes at $r < r_0$ are on average occupied. 
In other words, there is an $l$-dependent scale below which the system is analogous to the continuum 
case, and above which the system looks intrinsically discrete. 
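These scales are simple to evaluate numerically. The sketch below just encodes $\lambda(r) = (A^{-1} r^{\beta})^{1/d}$, $l_0 = \lambda(1)$ and the solution of $\lambda(r_0) = l$, namely $r_0(l) = (A l^d)^{1/\beta}$; the function names and the sample values of $A$, $d$ and $\beta$ are illustrative assumptions.

```python
def interparticle_distance(r, A, d, beta):
    """Local mean interparticle distance at radius r for number density A * r^(-beta)."""
    return (r ** beta / A) ** (1.0 / d)

def l0(A, d):
    """Box size above which all boxes of a unit-size halo are occupied on average."""
    return A ** (-1.0 / d)

def r0(l, A, d, beta):
    """Radius at which the interparticle distance equals the box size l."""
    return (A * l ** d) ** (1.0 / beta)

A, d, beta = 1000.0, 3, 2.0
print(l0(A, d))                 # largest interparticle distance, reached at r = 1
print(r0(0.05, A, d, beta))     # inside this radius, boxes of size 0.05 are occupied
```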

A major difference between a fractal (or a multifractal) and a halo described by eq. (1) 
is the fact that, in the fractal, the density fluctuations are large at any scale. In the halo, 
instead, the density varies smoothly. A valid candidate to quantify such fluctuations is the 
conditional variance, defined as the mean square density fluctuation in spheres centered on 
points of the system, normalised to the average conditional density (eq. (2)) [29]: 

$$\sigma^2(r) = \frac{\langle \rho_R(r)^2 \rangle_c - \langle \rho_R(r) \rangle_c^2}{\langle \rho_R(r) \rangle_c^2} \qquad (4)$$



where $\rho_R(r)$ is the density in a sphere centered in $R$ with radius $r$, and the subscript $c$ means 
that the corresponding quantities are "conditional". In particular, $\langle \rho_R(r)^2 \rangle_c$, where the 
average $\langle \ldots \rangle_c$ is performed over all occupied points, can actually be rewritten as 
$\langle \rho_R(r_0)\, \rho_R(r)\, \rho_R(r) \rangle$, where the average $\langle \ldots \rangle$ is performed over 
all the points in the volume. In turn, $\langle \rho_R(r_0)\, \rho_R(r)\, \rho_R(r) \rangle$ is actually the 
three-point correlation function $\langle \rho_R(r_i)\, \rho_R(r_j)\, \rho_R(r_k) \rangle$ with $r_i = r_j$. This 
shows that $\langle \rho_R(r)^2 \rangle_c$ is in fact closely related to the three-point correlation function. 

In general, for a point distribution, $\sigma^2(r)$ will be given by the sum of two terms: 
$\sigma^2(r) = \sigma_P^2(r) + \sigma_I^2(r)$, where $\sigma_P^2(r) \approx (\langle \rho_R(r) \rangle_c V(r))^{-1}$ 
is the variance due to Poissonian noise and $\sigma_I^2(r)$ is the intrinsic variance of the system, 
which depends on its specific properties. 
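A minimal estimator of these quantities for a finite point set is sketched below. It assumes that the normalised variance is the variance of the neighbour counts in spheres centred on the points divided by the squared mean count (the volume $V(r)$ cancels in the ratio), and that the Poisson term is the inverse of the mean count, as in the expression for $\sigma_P^2(r)$ above. Boundary effects are ignored and the function name is hypothetical.

```python
import numpy as np

def conditional_variance(points, r):
    """Return (total, poisson, intrinsic) normalised conditional variance at radius r."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)            # exclude the centre of each sphere
    counts = (dist <= r).sum(axis=1)          # neighbours of each point within r
    mean = counts.mean()
    if mean == 0:
        raise ValueError("no neighbours within r; choose a larger radius")
    total = counts.var() / mean ** 2          # densities share V(r), so counts suffice
    poisson = 1.0 / mean                      # (<rho>_c V(r))^-1
    return total, poisson, total - poisson
```

Evaluating the intrinsic term over a range of radii for a halo and for a fractal sample would give curves of the kind compared in fig. 2; for very sparse spheres the Poisson term dominates and the difference is not meaningful.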




Fig. 2 - Normalized conditional variance $\sigma_I^2(r)$ for the samples described in fig. 1. Solid line 
with squares: fractal; empty circles: halo. Solid line: theoretical result from eq. (5). 

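It is possible to compute $\sigma_I^2(r)$ for a cluster described by eq. (1): 

[Eq. (5): analytic expression of $\sigma_I^2(r)$ for the halo as a function of $r$, $d$ and $\beta$; the typeset formula is not recoverable from this copy.] 

(5) 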



From eq. (5) it is easy to see that for $\beta > d/2$, $\sigma_I^2(r) \propto r^{\beta - d}$. On the contrary, 
since a fractal is a scale invariant structure, $\sigma_I^2(r)$ (often referred to as lacunarity [29-31]) 
is constant. In fig. 2 we plot $\sigma_I^2(r)$ for a fractal and a halo, together with the analytic result 
of eq. (5). 

In addition to the conditional variance we introduce a new statistical concept, the "scaling 
gradient" $\Lambda$, which also permits a local analysis of the fluctuations. 

Fig. 3 - Measure of the scaling gradient $\Lambda(l)$ as a function of the box size $l$ for different 
samples in 3d. Empty circles: a fractal with dimension $D = 2$ generated by a Levy-flight algorithm. 
Filled circles: a halo with $\beta = 2$. Triangles: a homogeneous set. In the inset, the procedure followed 
to measure $\Lambda(l)$ in 1d: two adjacent occupied boxes (the boxes with filled circles) are divided in 
two offsprings each. The offsprings of the left box are both occupied ($p = 1$), while just one of the 
offsprings of the right one is occupied ($p = 1/2$). The resulting $\Lambda$ is 1/2. 

Consider a point distribution in $d$ dimensions extending over a finite volume. The volume 
is divided into $N_{box}$ identical boxes of size $l$, with the number of occupied boxes being $N_{occ}(l)$. 

We identify all the adjacent pairs of occupied boxes $N_i(l)$, where $i$ runs over all the occupied 
adjacent boxes, $N_{adj\_occ}$. Each box $i$ of the occupied ones is divided into $N_s(i)$ identical boxes 
(offsprings); some of these will be occupied and we denote them as $N_{s\_occ}(i)$. $N_{s\_occ}(i)$ is 
the number of occupied offsprings in the box $i$, and we define $p_i = N_{s\_occ}(i)\, N_s(i)^{-1}$ as the 
fraction of occupied offsprings of the box $i$. 

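The scaling gradient of the system is defined as: 

$$\Lambda(l) = \frac{1}{N_{adj\_occ}} \sum_{\langle i,j \rangle} |p_i - p_j| \qquad (6)$$

where the sum extends over all pairs $\langle i,j \rangle$ of adjacent occupied boxes, $N_{adj\_occ}$ in number. 
This measure has the following properties: 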

(i) it is a conditional measure, since it only considers occupied adjacent pairs; 

(ii) it considers the occupation density p, which is a measure of how the occupation of the 
boxes scales in the system; 

(iii) it is sensitive to local fluctuations of p, although it is averaged over the whole system. 
In other words, the scaling gradient measures the fluctuations of the fragmentation prop- 
erties of the system. 
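The procedure can be sketched in a few lines of code. The version below makes several assumptions that should be read as illustrative choices, not as the original implementation: the points lie in the unit cube $[0,1)^d$, every box is split into $2^d$ offsprings by halving each side, boxes count as adjacent when they share a face, and $\Lambda(l)$ is taken as the mean absolute difference of $p$ over adjacent occupied pairs, which reproduces the worked example in the inset of fig. 3 ($p = 1$ and $p = 1/2$ giving $1/2$). The function name is hypothetical.

```python
import numpy as np

def scaling_gradient(points, n_boxes):
    """Scaling gradient for boxes of size l = 1 / n_boxes; points in [0, 1)^d, shape (N, d)."""
    pts = np.asarray(points, dtype=float)
    if pts.ndim == 1:
        pts = pts[:, None]                       # treat a flat array as 1d positions
    d = pts.shape[1]
    parents = np.floor(pts * n_boxes).astype(int)
    children = np.floor(pts * 2 * n_boxes).astype(int)
    # Collect the distinct occupied offsprings of every occupied box.
    occupied = {}
    for parent, child in zip(map(tuple, parents), map(tuple, children)):
        occupied.setdefault(parent, set()).add(child)
    p = {box: len(kids) / 2 ** d for box, kids in occupied.items()}
    # Visit each face-adjacent occupied pair once (neighbour at +1 along one axis).
    diffs = []
    for box in p:
        for axis in range(d):
            neighbour = list(box)
            neighbour[axis] += 1
            neighbour = tuple(neighbour)
            if neighbour in p:
                diffs.append(abs(p[box] - p[neighbour]))
    return float(np.mean(diffs)) if diffs else float("nan")

# Inset of fig. 3: left box has both offsprings occupied, right box only one.
print(scaling_gradient([[0.1], [0.3], [0.6]], n_boxes=2))
```

Calling the function for a sequence of `n_boxes` values gives $\Lambda$ as a function of the box size.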

The results of a measure of $\Lambda(l)$ in three different 3d samples are shown in fig. 3. While 
the measure of $\Lambda(l)$ for the homogeneous set and the halo shows a peak at a characteristic 
scale, the fractal distribution has a flat $\Lambda(l)$. 

The behavior of $\Lambda(l)$ for the halo can be explained as follows. For $l$ such that $r_0(l) = 
(A l^d)^{1/\beta} \gg 1$ all boxes and their "offsprings" are occupied: in this case, $\Lambda(l) \simeq 0$. 
When $l$ is such that $r_0(l) \gtrsim 1$, all the boxes are occupied, but some of their "offsprings" (with 
distance from the center $r \sim 1$) will be empty. Therefore $\Lambda(l)$ grows. Eventually, when $l$ is 
such that $r_0(l) = 1$, all the boxes will be occupied on average. Consider now the generation of box 
offsprings in this case: their size is such that a large number of them is empty. In particular, 
it is the maximum number of empty boxes deriving from occupied boxes. For this reason, 
$\Lambda(l)$ reaches a maximum. This is apparent in fig. 3. On the contrary, since a fractal is a scale 
invariant structure, $\Lambda(l)$ is constant at all scales larger than the lower cut-off. The scaling 
gradient is therefore able to detect unambiguously the scaling properties of different systems 
characterised by the same two-point correlations. 
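For readers who wish to reproduce this kind of comparison, a fractal test sample can be generated with a Levy flight, as in figs. 1 and 3. The sketch below uses the standard recipe, assumed here and not spelled out in the text: isotropic steps whose length $s$ follows $P(>s) = (s_{min}/s)^D$ for $s \geq s_{min}$, which gives a point set of dimension $D$ for $D \leq 2$. The function name, the lower cut-off `s_min` and the seed are illustrative.

```python
import numpy as np

def levy_flight(n_points, D=1.0, d=2, s_min=1e-3, seed=0):
    """Random walk with power-law step lengths, P(> s) = (s_min / s)^D for s >= s_min."""
    rng = np.random.default_rng(seed)
    u = 1.0 - rng.random(n_points - 1)            # uniform in (0, 1]
    lengths = s_min * u ** (-1.0 / D)             # inverse-transform sampling of the step
    directions = rng.normal(size=(n_points - 1, d))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    steps = lengths[:, None] * directions
    # Start at the origin and accumulate the steps.
    return np.vstack([np.zeros(d), np.cumsum(steps, axis=0)])

walk = levy_flight(8000, D=1.0, d=2)              # same size and dimension as in fig. 1
```

The walk is unbounded, so the points need to be rescaled (or wrapped) into the analysis volume before box-based measures are applied.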

In summary, N-body simulations provide evidence for the formation of halos, clusters 
which are not really fractals but are still characterized by power-law correlations. The galaxy 
distribution, instead, appears more compatible with fractal behaviour in a range of scales. 
We have addressed the fundamental issue of discriminating between the two distributions 
by offering a series of tools which help to clarify this problem. This 
requires going beyond the two-point correlations, although with a careful critical analysis. 
For example, we show that the multifractal approach is not suitable in this respect. The 
conditional variance is more appropriate for the global properties at large scales, but for the 
more relevant case of local scaling, we introduce the new concept of "scaling gradient". These 
methods and their critical analysis will be a crucial element for extracting the relevant 
statistical properties from future large galaxy catalogues and N-body simulations. 

* * * 

We are grateful to prof. M. A. Munoz for careful reading and stimulating comments on 
the manuscript. 